Abstract — As process technologies scale into the deep submicron
region, crosstalk delay is becoming increasingly severe,
especially for global on-chip buses. To cope with this problem, 
accurate delay models of coupled interconnects are needed. In 
particular, delay models based on analytical approaches are desir- 
able, because they not only are largely transparent to technology, 
but also explicitly establish the connections between delays of 
coupled interconnects and transition patterns, thereby enabling 
crosstalk alleviating techniques such as crosstalk avoidance codes 
(CACs). Unfortunately, existing analytical delay models, such as 
the widely cited model in [1], have limited accuracy and do 
not account for loading capacitance. In this paper, we propose 
analytical delay models for coupled interconnects that address 
these disadvantages. By accounting for more wires and eschewing 
the Elmore delay, our delay models achieve better accuracy than 
the model in [1]. 

Index Terms — Crosstalk, interconnect, delay, bus 



I. Introduction 

Crosstalk caused by coupling capacitance between adjacent
wires adds delay to multi-wire buses. As process
technologies scale into the deep submicron region, coupling
capacitance between adjacent wires and hence crosstalk delays
increase greatly. According to the International Technology 
Roadmap of Semiconductors (ITRS) [2], gate delay decreases 
with scaling, while global wire delay increases. Hence, the 
crosstalk delay problem is becoming increasingly severe in 
VLSI designs, especially for global on-chip buses, and will 
become the performance bottleneck in many high-performance 
VLSI designs. 

This paper focuses on analytical delay models applicable to 
general coupled interconnects. Although various delay models 
of interconnects have been proposed in the literature (see, 
for example, [1], [3]-[11]), few are comparable to our work
in this paper. Some delay models (see, for example, [3]-
[5], [7], [9], [11]) do not consider crosstalk from adjacent 
wires. Furthermore, most previously proposed delay models 
are based on numerical approaches (see, e.g., [3]-[5], [7]- 
[11]). They can achieve high accuracy, but they have several 
drawbacks. First, they sometimes lead to lookup tables of 
delays from any initial state to any next state (see, for 
example, [11]), which are often bulky and difficult to obtain 
and use. Second, numerical approaches in [3]-[5], [7], [9] are 
technology-dependent and their delays often depend on many 
parameters. Hence these approaches have poor portability 
and are not applicable to general cases. Third, the delays 
obtained by the numerical approach offer little insight, and are
not conducive to technology-independent crosstalk alleviation 
techniques such as crosstalk avoidance codes (CACs) (see, for 
example, [12]-[15]). Fourth, numerical approaches often have 
very high complexities. In contrast, analytical approaches are 
advantageous in these aspects. Analytical approaches depend 
on few technology parameters, and hence they are largely 
technology independent. Furthermore, analytical approaches 
illustrate the connection between delays of coupled intercon- 
nects and transition patterns, thus enabling us to design CACs. 
Finally, analytical approaches have very low computational 
complexities. A widely cited analytical delay model proposed
by Sotiriadis et al. [1], [6], which uses a methodology similar
to that in [16], appears to be the most comparable previous
delay model to our work in this paper.

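Based on the model in [1], [6], the delay of the $k$-th wire
($k \in \{1, 2, \cdots, m\}$) of an $m$-bit bus is given by

$$
T_k = \begin{cases}
\tau_0[(1+\lambda)\Delta_1^2 - \lambda\Delta_1\Delta_2], & k = 1, \\
\tau_0[(1+2\lambda)\Delta_k^2 - \lambda\Delta_k(\Delta_{k-1} + \Delta_{k+1})], & k \neq 1, m, \\
\tau_0[(1+\lambda)\Delta_m^2 - \lambda\Delta_m\Delta_{m-1}], & k = m,
\end{cases}
\qquad (1)
$$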

where $\lambda$ is the ratio of the coupling capacitance between
adjacent wires to the ground capacitance of each wire, $\tau_0$
is the intrinsic delay of a transition on a single wire, and $\Delta_k$
is 1 for a 0 → 1 transition, -1 for a 1 → 0 transition, or 0 for no
transition on the $k$-th wire. We observe that in this model, the
delay of the $k$-th wire depends on the transition patterns of
wires $k-1$, $k$, and $k+1$ only. Since all possible values of $T_k$
in Eq. (1) are $(1 + i\lambda)\tau_0$ for $i \in \{0, 1, 2, 3, 4\}$, all transition
patterns on wires $k-1$, $k$, and $k+1$ can be divided into five
classes according to their corresponding $i$. These five classes
are denoted as $i$C for $i \in \{0, 1, 2, 3, 4\}$ (this classification
was also used in [12]). Based on this model, various CACs
(see, for example, [12]-[15]) have been proposed. Their
central idea is to achieve a reduced delay by limiting transition
patterns over the bus, at the expense of additional wires.
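
To make the classification concrete, the following minimal sketch evaluates the model of [1] for every wire of a bus and reads off the class index $i$. The pattern encoding ('u' for a rising wire, 'd' for a falling wire, '-' for a quiet wire) and the function names are illustrative assumptions, not notation from [1].

```python
# Sketch of the three-wire-neighbourhood delay model of [1] (Eq. (1)).
# Encoding assumed here: 'u' = 0->1, 'd' = 1->0, '-' = no transition.
DELTA = {"u": 1, "d": -1, "-": 0}


def wire_delay(pattern, k, lam, tau0=1.0):
    """Delay of wire k (1-indexed) for coupling ratio lam, in units of tau0."""
    d = [DELTA[s] for s in pattern]
    dk = d[k - 1]
    # A boundary wire has one neighbour, an internal wire has two.
    neighbours = [d[j - 1] for j in (k - 1, k + 1) if 1 <= j <= len(d)]
    return tau0 * ((1 + len(neighbours) * lam) * dk ** 2
                   - lam * dk * sum(neighbours))


def wire_class(pattern, k):
    """Class index i with delay (1 + i*lam)*tau0, or None for a quiet wire."""
    if DELTA[pattern[k - 1]] == 0:
        return None
    # With lam = 1 and tau0 = 1 the delay equals 1 + i.
    return round(wire_delay(pattern, k, lam=1.0) - 1)


if __name__ == "__main__":
    bus = "ududdd"
    print([wire_class(bus, k) for k in range(1, len(bus) + 1)])
```

For the six-wire pattern used in the example of Section II-F, this returns the classes 2, 4, 4, 2, 0, 0 for wires 1 through 6.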

However, the model in [1] has two significant drawbacks.
First, the model in [1] has limited accuracy. In a bus with more
than three wires, the simulated wire delay for 0C transition
patterns is much larger than $\tau_0$, the delay of 0C given by
Eq. (1). This implies that the scheme that uses two shield wires
with the same transition to achieve a delay of $\tau_0$ (see, for
example, [17]) will be ineffective. Our simulation results also
show that the delays of other classes of transition patterns
given by Eq. (1) have limited accuracy as well. This is partially
because of the model's dependence on only three wires. Also,
the model in [1] uses in its derivation the Elmore delay, which
tends to overestimate the delay [18], [19].

The second drawback of the model in [1] is that it does not 
account for the loading capacitance. It has been shown that the 
loading capacitance is crucial in real practice and can affect 
the total delay for all patterns. 

To address these disadvantages of the model in [1], in
this paper we propose analytical delay models for coupled
interconnects. Our delay models first derive closed-form ex- 
pressions of the signals on the bus via a distributed RC model, 
and then approximate the wire delays by evaluating these 
closed-form expressions. Our delay models differ from the 
model in [1] in three aspects. First, in our delay models, we 
eschew the Elmore delay used in the model in [1]. Then, we 
consider either three wires or five wires in our delay models 
for improved accuracy. Due to these two differences, our
models are significantly more accurate than the model
in [1]. Finally, we take into account the buffer effects (driver
resistance and loading capacitance). Our delay models also 
maintain the simplicity of the model in [1], and the transition 
patterns are divided into several categories based on their 
delays. Hence, our delay models are easy to use and conducive 
to the design of CACs. Although our delay models consider 
adjacent three and five wires in this paper, our models are 
applicable to buses of any number of wires. 

Simulation results show that our delay models offer significant
advantages over the model in [1]. Our simulation
results fall into three scenarios. First, we compare the delays
produced by our model and the model in [1] with the simulated 
delays for three- or five-wire buses. This is motivated by partial 
coding schemes (see, e.g., [12], [13], and [14]), which divide 
a wide bus into sub-buses with a few wires and separate them 
by shielding wires. Second, we obtain extensive simulation 
results for 17- and 33-wire buses assuming arbitrary transition 
patterns. Third, we assume the transition patterns are limited 
to those of CACs. In all three scenarios, our five-wire delay 
model is much more accurate than the model in [1]. 

With the scaling of technologies, inductance is becoming
significant and can greatly affect the signals on the bus. Due to
the coupling effect of inductance, the worst-case patterns for
a bus modeled as RLC are quite different from those of a bus
modeled as RC [10]. Hence, the CAC design methodology would
change greatly due to the inductance effect. However, our
delay models do not consider the inductance effect for two
reasons. First, it seems difficult to derive a closed-form expression
of the signals on the bus based on the RLC model. Hence, our
methodology cannot be easily adapted from the distributed RC
model to an RLC model. Second, according to the criteria in
[22], the inductance effect is significant in some cases, but is
negligible in other cases. Specifically, the range of significance
of the inductance effect is given by
$\frac{t_r}{2\sqrt{lc}} < x < \frac{2}{r}\sqrt{\frac{l}{c}}$ [22],
where $x$ is the length of the wire, $t_r$ the input transition time,
and $r$, $l$, and $c$ the resistance, inductance, and capacitance per
unit length, respectively. According to [23], the inductance
effect is not negligible for very deep submicron technologies
and extremely long wires. In current industry applications, the
on-chip inductance effect is still insignificant. This conclusion
was also confirmed by other works: the 16-bit, 32Gb/s, 5mm-long
bus and the 8-bit, 16Gb/s, 10mm-long bus in [24] show that
the distributed RC model is sufficiently accurate for these
high-speed long interconnects. In our work, our delay models are
derived based on 5mm-long buses under a 45nm technology,
where the inductance effect is negligible.
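
As a quick numerical check of the criterion quoted from [22], the sketch below evaluates its two length bounds. It assumes the criterion has the form $\frac{t_r}{2\sqrt{lc}} < x < \frac{2}{r}\sqrt{\frac{l}{c}}$, borrows the per-unit-length values from Table VIII (reading 8.263 fF/mm as the ground capacitance $c$), and uses the 10 ps transition time quoted in Section III. The function name is hypothetical.

```python
import math


def inductance_bounds_mm(t_r, r, l, c):
    """Lower and upper wire-length bounds (mm) of the criterion from [22].

    t_r: input transition time in s; r: ohm/mm; l: H/mm; c: F/mm.
    The form of the two bounds is an assumption stated in the text above.
    """
    lower = t_r / (2.0 * math.sqrt(l * c))
    upper = (2.0 / r) * math.sqrt(l / c)
    return lower, upper


if __name__ == "__main__":
    lo, hi = inductance_bounds_mm(t_r=10e-12, r=13.75,
                                  l=1.736e-9, c=8.263e-15)
    print(f"{lo:.1f} mm < x < {hi:.1f} mm")  # prints "1.3 mm < x < 66.7 mm"
```

With these inputs the two bounds come out at the 1.3 mm and 66.7 mm figures quoted in Section III.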

The rest of the paper is organized as follows. In Section II, 
we propose our delay models. The delay models are also 
modified to account for the buffer effects. In Section III, 
we present extensive simulation results for our delay models. 
Concluding remarks are provided in Section IV. 

II. Delay model 

In this section, we first present the system model, where 
switching instants of all wires in the bus are assumed simul- 
taneous. For three-wire and five-wire buses, we then derive 
closed-form expressions for outputs of the bus, and finally 
approximate their delays and compare them with those by the 
model in [1]. 

A. System model 

In this paper, we focus on global interconnects connecting 
different modules for communication, such as data and address 
buses, and use the distributed RC model for interconnect 
modeling. For simplicity, we assume regular interconnects,
which have uniformly distributed parameters and are routed
in parallel in the same metal layer without turns. Hence, the
interconnects are modeled as transmission lines, which can 
be characterized by the telegrapher's equations. For complex 
interconnect structures with jumps and turnings, additional 
resistance due to vias and unequal length of wires should 
be included, which makes the interconnect behavior more 
complicated. However, crosstalk delays are expected to in- 
crease due to the additional resistance. The partially coupled 
buses are more complex and hence are not considered in this 
paper. We plan to investigate this in our future work. The ca- 
pacitance between non-adjacent wires is negligible compared 
with capacitance between adjacent wires, since the capacitive 
coupling effect is a short range effect [16]. The distributed RC 
model is often used to approximate the buses [25]. Although 
the closed-form expressions of the signals on the bus via 
a distributed RC model are sums of infinite terms, usually 
sums of the two most significant terms provide a very close 
approximation of signals on the bus [26]. 

The distributed RC model of an $m$-wire bus is shown
in Fig. 1, where $V_i(x,t)$ denotes the transient signal at a
position $x$ along wire $i$ for $i \in \{1, 2, \cdots, m\}$, and $r$ and $c$ denote
the resistance and capacitance per unit length, respectively.
Also, $\lambda c$ denotes the coupling capacitance per unit length
between two adjacent wires. The output resistance of a driver
is approximated as a linear resistor, $R_S$, and the loading due
to a receiver is modeled as a capacitance, $C_L$. In this work,
we focus on a uniformly distributed bus and hence assume the
parameters $r$, $c$, and $\lambda$ are the same for all the wires.

We use the 50% delay, which is defined as the time difference
between the respective instants when the input signal and the
corresponding output signal cross 50% of the supply voltage
$V_{dd}$. According to [27], the delays of global interconnects are
slightly affected by the slew rate. Since this work focuses on
global interconnects, we ignore input slew and assume ideal
step signals are applied on the bus directly. In this paper, we
use the same classification $i$C for $i = 0, 1, 2, 3, 4$ as in [1] and
focus on the worst 50% delay of any wire for all classes to
formulate our delay models. We consider the closest neighbors
for crosstalk, since farther wires have weaker coupling effects.
In Section II-B, we first focus on an internal wire, wire 2 in a
three-wire model, to account for its two nearest neighbors (one
wire to the left and one to the right). In Section II-C, we focus
on an internal wire, wire 3 in a five-wire model, to account for
its four nearest neighbors (two wires to the left and two to the
right). In Section II-D, we derive the delay for boundary wires,
wires 1, 2, 4, and 5 in a five-wire model. In Section II-F, we
show how to identify the worst-case delays among all wires
for a wide bus via a shift window scheme.

Fig. 1. A distributed RC model of an m-wire bus.

In this section, we first derive delay models by assuming that 
the buffer effects (driver resistance and loading capacitance) 
are negligible. This is an important case since the propagation 
delay is characterized only by the distributed interconnects. 
Then, in Section II-E, we modify the delay models to account 
for the buffer effects, which are crucial in real practice. It has 
been shown that the buffer effects would increase the total 
delay for all patterns. 

Below we first investigate the case $m = 3$ and then
extend our results to the case $m = 5$. There are two reasons
for studying the three-wire model. First and foremost, the
derivation of our five-wire model is based on the three-wire
model. Second, our three-wire model is more accurate than
our five-wire model for buses with only three wires, which
is of interest for partial coding schemes (see, e.g., [12], [13],
and [14]). We use $T_m^{iC}$ to denote the worst delay of the middle
wire (wire $\frac{m+1}{2}$) of an $m$-wire bus for all $i$C patterns.



B. Internal wires for three-wire model 

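In [26], the crosstalk of two coupled lines was described
by partial differential equations (PDEs), and a technique for
decoupling these highly coupled PDEs was introduced by
using eigenvalues and corresponding eigenvectors. Using the
same technique as in [26], we obtain the differential equations

$$
\frac{\partial^2}{\partial x^2}\mathbf{V}(x,t) = \mathbf{R}\mathbf{C}\frac{\partial}{\partial t}\mathbf{V}(x,t), \qquad (2)
$$

where $\mathbf{V}(x,t) = [V_1(x,t)\ V_2(x,t)\ V_3(x,t)]^T$ and $V_i(x,t)$
denotes the voltage of wire $i$ at distance $x$ ($0 \le x \le L$)
at time $t$ for $i = 1, 2, 3$, $\mathbf{R} = \mathrm{diag}\{r\ r\ r\}$, and

$$
\mathbf{C} = c \begin{bmatrix} 1+\lambda & -\lambda & 0 \\ -\lambda & 1+2\lambda & -\lambda \\ 0 & -\lambda & 1+\lambda \end{bmatrix}.
$$

The boundary conditions are given by

$$
\begin{cases}
V_i(0,t) = V_i^I - (V_i^I - V_i^F)u(t) \\
I_i(L,t) = 0
\end{cases}
\quad \text{for } i = 1, 2, 3,
$$

where $V_i^I$ and $V_i^F$ denote the initial and final voltages of the
transition on wire $i$, respectively.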

We find the three eigenvalues of $\mathbf{C}/c$, $p_1 = 1$, $p_2 = 1 + \lambda$,
and $p_3 = 1 + 3\lambda$, and their corresponding eigenvectors $\mathbf{e}_i$,
$[1\ 1\ 1]^T$, $[1\ 0\ -1]^T$, and $[-1\ 2\ -1]^T$, respectively. Hence, Eq. (2)
is transformed to

$$
\frac{\partial^2}{\partial x^2}U_i(x,t) = r c p_i \frac{\partial}{\partial t}U_i(x,t) \quad \text{for } i = 1, 2, 3, \qquad (3)
$$

where $U_i(x,t) = \mathbf{V}^T(x,t)\mathbf{e}_i$ for $i = 1, 2, 3$. So $U_1(x,t) =
V_1(x,t) + V_2(x,t) + V_3(x,t)$, $U_2(x,t) = V_1(x,t) - V_3(x,t)$,
and $U_3(x,t) = 2V_2(x,t) - V_1(x,t) - V_3(x,t)$.
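
The decoupling step can be checked numerically. The short sketch below, written in plain Python with hypothetical helper names, confirms that the three vectors are eigenvectors of the normalized three-wire capacitance matrix with eigenvalues 1, $1+\lambda$, and $1+3\lambda$, and that the middle-wire voltage is recovered from the modal signals as $(U_1 + U_3)/3$. The value 12.2 used for the coupling ratio is only an example.

```python
def coupling_matrix(lam):
    """Normalized capacitance matrix C/c of a three-wire bus."""
    return [[1 + lam, -lam, 0],
            [-lam, 1 + 2 * lam, -lam],
            [0, -lam, 1 + lam]]


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def check_eigenpairs(lam, tol=1e-9):
    """True if M e = p e holds for the three stated (p, e) pairs."""
    m = coupling_matrix(lam)
    pairs = [(1, [1, 1, 1]),
             (1 + lam, [1, 0, -1]),
             (1 + 3 * lam, [-1, 2, -1])]
    return all(abs(x - p * e) < tol
               for p, vec in pairs
               for x, e in zip(matvec(m, vec), vec))


def to_modes(v1, v2, v3):
    """Modal signals U1, U2, U3 built from the wire voltages."""
    return v1 + v2 + v3, v1 - v3, 2 * v2 - v1 - v3


def middle_wire(u1, u3):
    """Recover V2 from the modal signals: V2 = (U1 + U3) / 3."""
    return (u1 + u3) / 3.0


if __name__ == "__main__":
    print(check_eigenpairs(12.2))
    u1, _, u3 = to_modes(0.2, 0.7, 0.1)
    print(middle_wire(u1, u3))
```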

Applying the Laplace transform to both sides of Eq. (3), we
have

$$
\frac{\partial^2}{\partial x^2}U_i(x,s) = r c p_i [s U_i(x,s) - U_i(x,0)] \quad \text{for } i = 1, 2, 3. \qquad (4)
$$

Using appropriate initial conditions, we solve Eq. (4) for
$U_i(x,t)$ and obtain $V_2(L,t) = \frac{1}{3}[U_1(L,t) + U_3(L,t)]$. By
solving $V_2(L,t) = 0.5V_{dd}$, we can approximate the 50% delay
of a three-wire bus for different transition patterns.

In this paper, we use "↑" to denote a transition from 0 to
the supply voltage $V_{dd}$ (normalized to 1), "-" no transition,
and "↓" a transition from $V_{dd}$ to 0.

For the 0C pattern ↑↑↑, the output of wire 2 is given by [26]

$$
V_2(L,t) = 1 + \sum_{n=1}^{\infty} \frac{4(-1)^n}{(2n-1)\pi} e^{-(2n-1)^2 t/\tau},
$$

where $\tau_0 = \frac{rcL^2}{2}$ and $\tau = \frac{8\tau_0}{\pi^2}$.

For the 50% delay, keeping only the first exponential term
is accurate enough. So we have $V_2(L,t) \approx 1 - \frac{4}{\pi}e^{-t/\tau}$.
Similarly, we keep only the first exponential term as the
solution for other cases. Solving $V_2(L,t) = 0.5$, we have
$T_3^{0C} \approx (\ln\frac{8}{\pi})\tau$. Similarly, the closed-form expressions of
wire 2 and approximate delays for other classes are derived
and summarized in Table I, where $T_3^{iC}$ is the approximate delay
for the $i$C pattern by our three-wire model.
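
The procedure of this subsection can be sketched numerically. The code below builds the one-term approximation of the middle-wire output from the two modes that reach wire 2 (eigenvalues 1 and $1+3\lambda$) and solves for the 50% crossing by bisection. It is a sketch under stated assumptions: the middle wire rises, voltages are normalized to $V_{dd} = 1$, time is in units of $\tau$, each mode keeps only its first exponential term with coefficient $4/\pi$, and the pattern encoding ('u', 'd', '-') and function names are hypothetical. It does not reproduce the closed-form delay approximations of Table I.

```python
import math

STEP = {"u": 1.0, "d": -1.0, "-": 0.0}  # normalized step amplitude per wire


def v2_end(t, pattern, lam, tau=1.0):
    """One-term approximation of the far-end voltage of wire 2."""
    a1, a2, a3 = (STEP[s] for s in pattern)
    u1 = a1 + a2 + a3        # amplitude of the mode with p1 = 1
    u3 = 2 * a2 - a1 - a3    # amplitude of the mode with p3 = 1 + 3*lam
    k = 4.0 / math.pi
    mode1 = u1 * (1.0 - k * math.exp(-t / tau))
    mode3 = u3 * (1.0 - k * math.exp(-t / ((1.0 + 3.0 * lam) * tau)))
    return (mode1 + mode3) / 3.0


def delay_50(pattern, lam, tau=1.0):
    """Time at which the approximate output of wire 2 crosses 0.5."""
    lo, hi = 0.0, 50.0 * (1.0 + 3.0 * lam) * tau
    for _ in range(200):  # bisection; the output ends at 1 for a rising wire
        mid = 0.5 * (lo + hi)
        if v2_end(mid, pattern, lam, tau) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for p in ("uuu", "uu-", "-u-", "du-", "dud"):
        print(p, round(delay_50(p, lam=12.2), 3))
```

For the pattern with all three wires rising, the solver returns $\ln(8/\pi)$ in units of $\tau$, in agreement with the closed form derived above.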



C. Internal wires for five-wire model 

To further improve the accuracy of delay, we include two
extra adjacent wires to approximate the delay by considering
the influences of all five wires. Each wire has three kinds
of transition: ↑, -, and ↓. Hence, for such a five-wire bus,
there are $3^5$ transition patterns. To maintain the simplicity of
our models, we still divide them into five classes ($i$C, $i \in
\{0, 1, 2, 3, 4\}$) based on the transition patterns of the middle three
wires (wires 2, 3, and 4). Hence, there are nine different
transition patterns for each pattern of the same class.

TABLE I
Closed-form expressions of signal on wire 2 and approximate
delays in a three-wire bus ($V_2(L,t) = 1 - A_1 e^{-t/\tau} - A_2 e^{-t/((1+3\lambda)\tau)}$).

iC    Worst pattern    $A_1$                  $A_2$                  $T_3^{iC}$
0C    ↑↑↑              $\frac{4}{\pi}$        0                      $(\ln\frac{8}{\pi})\tau$
1C    ↑↑-              $\frac{8}{3\pi}$       $\frac{4}{3\pi}$       $(\ln\ldots)\tau$
2C    -↑-              $\frac{4}{3\pi}$       $\frac{8}{3\pi}$       $(\ln\ldots)(1+3\lambda)\tau$
3C    ↓↑-              0                      $\frac{4}{\pi}$        $(\ln\frac{8}{\pi})(1+3\lambda)\tau$
4C    ↓↑↓              $-\frac{4}{3\pi}$      $\frac{16}{3\pi}$      $(\ln\ldots)(1+3\lambda)\tau$

TABLE II
Decomposition of worst-case patterns in the five-wire model.

Since the interconnect is a linear system, any pattern can
be decomposed into a combination of patterns with transitions
on a single wire. For example, ↑↑↑↓- is decomposed as
(↑----) + (-↑---) + (--↑--) + (---↓-). The delay expression
of the middle wire impacted by any pattern is given by a
summation of effects of individual wires on the middle wire.
However, this approach would result in expressions that are
hard to analyze. Instead, we propose to group these individual
wires to form some special patterns, which can be analyzed
easily.
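
The linearity argument can be illustrated with a few lines of code. The sketch below splits a pattern into patterns with a transition on a single wire and checks that adding the step amplitudes wire by wire gives back the original pattern. The encoding ('u' rising, 'd' falling, '-' quiet) and the function names are assumptions made for illustration only.

```python
STEP = {"u": 1, "d": -1, "-": 0}  # step amplitude of each transition symbol


def decompose(pattern):
    """Split a pattern into patterns with a transition on one wire only."""
    width = len(pattern)
    return ["-" * i + s + "-" * (width - i - 1)
            for i, s in enumerate(pattern) if s != "-"]


def superpose(patterns):
    """Add the step amplitudes of several patterns wire by wire."""
    width = len(patterns[0])
    return [sum(STEP[p[i]] for p in patterns) for i in range(width)]


if __name__ == "__main__":
    parts = decompose("uuud-")
    print(parts)             # ['u----', '-u---', '--u--', '---d-']
    print(superpose(parts))  # [1, 1, 1, -1, 0]
```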

Definition 1: Reducible transition pattern (RTP)
An RTP in the five-wire model is defined as a transition
pattern that can be reduced to a transition pattern in the
three-wire model. The set {↑↑↑↑↑, ↓↓↓↓↓, ↓-↑-↓, ↑-↓-↑} is the set
of RTPs for the five-wire model.

For the transition ↑↑↑↑↑ (similarly for ↓↓↓↓↓), all wires
have the same transitions. There is no effective coupling capacitance
between any two adjacent wires. So the expression of wire
3 is approximated by $V_3(L,t) \approx 1 - \frac{4}{\pi}e^{-t/\tau}$ and the delay is
approximated by $(\ln\frac{8}{\pi})\tau$. For the transition ↓-↑-↓ (similarly
for ↑-↓-↑), wires 2 and 4 can be approximated as ground
wires in the five-wire bus, since wire 1 (or 5) and wire 3
have opposite transitions. For wire 3, the five-wire pattern is
equivalent to a three-wire pattern ↓↑↓, where the equivalent
coupling capacitor between wire 1 (or 5) and wire 3 is
equal to two capacitors in series between wires 1 and 2,
and wires 2 and 3 (or wires 3 and 4, and wires 4 and 5).
Hence, the equivalent coupling factor between wire 1 (or
5) and wire 3 is approximated as $\frac{\lambda}{2}$ per unit length (that
is, the ratio of the coupling capacitance and the loading
capacitance is $\frac{\lambda}{2}$). The expression of wire 3 and its delay are
then approximated from the three-wire results for ↓↑↓ with
$\lambda$ replaced by $\frac{\lambda}{2}$.



Definition 2: Single transition pattern (STP)
An STP is defined to be a transition pattern with transitions
on only one wire. For our five-wire model, we focus on the
set of STPs with transitions on wire 2 or 4, {-↑---, -↓---,
---↑-, ---↓-}.

The expressions of wire 3 can be approximated by
considering wires 2, 3, and 4 as a three-wire model. Let
$V_j^i(x,t)$ denote the signal on wire $j$ due to coupling from
wire $i$. For example, by ignoring coupling from wires 1
and 5 in -↑---, the output of wire 3 is approximated by
$V_3^2(L,t) \approx -\frac{4}{3\pi}e^{-t/\tau} + \frac{4}{3\pi}e^{-t/((1+3\lambda)\tau)}$, which is obtained by
considering only wires 2, 3, and 4.

We propose the following approach to derive the delay of
the five-wire bus.

• We first decompose the worst pattern in each class into 
a combination of an RTP and STP(s). 

• Then we combine the expressions of the RTP and STP(s) 
for the middle wire based on the conclusion of our three- 
wire model. 

• Finally, we evaluate the expression of the middle wire to 
approximate its delay. 

Since the performance is limited by the worst-case delay
in each class, we need to approximate the delays of only the
worst patterns in each class. We use simulation to identify
the worst patterns in all classes. The worst patterns for 0C
to 4C are given by ↓↑↑↑↓, ↓↑↑-↓, ↓-↑-↓, ↑↓↑-↓, and ↑↓↑↓↑,
respectively (assuming the middle wire has an upward transition).
With RTPs and STPs, we decompose the worst pattern
in each class as shown in Table II.

The closed-form expressions of wire 3 and approximate
delays for all classes in a five-wire bus are derived and
summarized in Table III, where $T_5^{iC}$ is the approximate delay
for the $i$C pattern by our five-wire model.

D. Boundary wires 

In the previous derivation, we focus on middle wires and
consider four neighboring wires (two to the left and two to the
right) for crosstalk. In this section, we derive delay models to
account for the boundary wires of an $m$-wire bus (wires 1, 2,
$m-1$, and $m$). For wire 1 (wire $m$), we consider wires 2 and 3
to the right (wires $m-2$ and $m-1$ to the left) for crosstalk,
and use the same classification as in Eq. (1) [1]. Note that
for wires 1 and $m$, there are only three classes of patterns,
0C, 1C, and 2C. Using a similar technique, the closed-form
expressions of wire 1 (wire $m$) and approximate delays for
all classes are derived and summarized in Table IV, where
$T_1^{iC}$ is the approximate delay for the $i$C pattern. For wire 2 (wire
$m-1$), we consider wire 1 to the left and wires 3 and 4 to
the right (wires $m-3$ and $m-2$ to the left and wire $m$ to the
right) for crosstalk. Similarly, the closed-form expressions of
wire 2 (wire $m-1$) and approximate delays for all classes
are derived and summarized in Table V, where $T_2^{iC}$ is the
approximate delay for the $i$C pattern.

TABLE IV
Closed-form expressions of signal on wire 1 and approximate delays.

TABLE V
Closed-form expressions of signal on wire 2 and approximate delays.

TABLE VI
Expressions of middle wire in a three-wire model.

TABLE VII
Expressions of middle wire in a five-wire model.

E. Revised models with consideration of the buffer effects

In the previous derivation, the buffer effects are ignored under
the assumption that the driver resistance and loading capacitance
are relatively small. In practice, the values of resistance
and capacitance vary with the structure of the buffers. In
this work, we consider drivers and receivers implemented
as a non-inverting inverter chain. The simplest one has two
chained inverters. The loading capacitance $C_L$ and driver
resistance $R_S$ are due to the first and last stage inverters
in the chain, respectively. The buffer strength is measured
by the size of the inverter normalized to the smallest inverter.
For global interconnects in submicron technology, the loading
capacitance is not significantly large in comparison with that of
the interconnect. According to [28], for a 45nm technology [29],
the loading capacitance $C_L$ induced by an inverter of 100 times
the minimum size is 25 fF. In this paper, we consider loading
capacitances as large as 100 fF. For significantly large $C_L$, the delay due to
$C_L$ would dominate the total propagation delay and all classes
of patterns would collapse into one class. In the following, we
revise our models to capture the buffer effects of $R_S$ and $C_L$
at the inputs and outputs of the interconnects, respectively.

First, we focus on our three-wire model. With consideration
of buffer effects, the differential equation is still given by
Eq. (2). Only the boundary conditions need to be changed.
The revised boundary conditions are given by

$$
\begin{cases}
V_i(0,t) = V_i^I - (V_i^I - V_i^F)u(t) - I_i(0,t)R_S \\
I_i(L,t) = C_L\frac{\partial}{\partial t}V_i(L,t)
\end{cases}
\quad \text{for } i = 1, 2, 3.
$$

By solving the differential equations of a three-wire bus, we
derive the expressions of all worst-case patterns as shown in
Table VI. The revised delay expressions are listed in column
five of Table VI. Note that the revised three-wire delay model
reduces to that in Table I when the driver resistance and
loading capacitance are negligible, $R_T = 0$ and $C_T = 0$.

Similarly, for a five-wire bus, we derive the expressions
of all worst-case patterns in each class as shown in Table VII.
The ratio between $\tau_2$ and $\tau_3$ is determined by $\lambda$, $RC$, $R_T$,
and $C_T$. The 50% delay can then be solved by assuming that
$e^{-t/\tau_2}$ is a power of $e^{-t/\tau_3}$. The revised
delay expressions are listed in column six of Table VII. Note
that the revised five-wire delay model reduces to that in
Table III when the driver resistance and loading capacitance
are negligible, $R_T = 0$ and $C_T = 0$.

TABLE VIII
Bus parameters in a 45nm technology (metal 10).

Parameter    Value        Parameter    Value
$L$          5 mm         $r$          13.75 Ω/mm
$w$          0.8 μm       $l$          1.736 nH/mm
$s$          0.8 μm       $c$          8.263 fF/mm
$t$          2 μm         $C_c$        101.136 fF/mm
$h$          4.82 μm      $R_S$        100 Ω
$K_{ILD}$    2.5          $C_L$        0 fF

According to the delay expressions in Tables VI and VII, 
both driver resistance and loading capacitance tend to increase 
the delay. When the loading capacitance increases, the delay 
difference among all classes diminishes. For extremely large 
$C_L$, the delays for all classes are close and the classification
becomes inconsequential. 

F. Characterization of the delay of a multi-wire bus 

In the derivation of our five-wire model above, we focus
on the worst-case patterns of the middle wires only. We also
derive delay models for boundary wires. In the following,
we show that our five-wire model can be easily applied to
approximate the delays of an $m$-wire bus ($m > 5$). First, we
use our five-wire delay model as a shift window to scan the
internal wires (wires 3 through $m-2$) to identify the longest
delay. Then, for boundary wires (wires 1, 2, $m-1$, and $m$), we
use the models in Tables IV and V for delay approximation.
Hence, the delay of an $m$-wire bus is given by the largest delay
among all wires. For example, for a pattern ↑↓↑↓↓↓ of a six-wire
bus, the classes for wires 1 through 6 are given by 2C,
4C, 4C, 2C, 0C, and 0C, respectively. Thus, the worst-case
class is given by 4C. According to our models in Tables III,
IV, and V, the worst-case delay is given by the larger one of
the two delays $6.540(1 + (2 - \sqrt{2})\lambda)\tau$ and $(\ln\ldots)(1 + 3\lambda)\tau$.
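
The scan described above can be sketched as follows. For each switching wire, the code determines its class and notes which table would supply the delay expression: Tables IV and V for the boundary wires and the five-wire model of Table III for internal wires. The delay expressions themselves are not evaluated here. The pattern encoding ('u', 'd', '-') and all function names are assumptions made for illustration.

```python
DELTA = {"u": 1, "d": -1, "-": 0}


def wire_class(pattern, k):
    """Class index of wire k (1-indexed), or None if the wire is quiet."""
    d = [DELTA[s] for s in pattern]
    dk = d[k - 1]
    if dk == 0:
        return None
    neighbours = [d[j - 1] for j in (k - 1, k + 1) if 1 <= j <= len(d)]
    # A neighbour adds 0 (same direction), 1 (quiet) or 2 (opposite).
    return sum(1 - dk * n for n in neighbours)


def five_wire_window(pattern, k):
    """Five-wire window centred on internal wire k (3 <= k <= m - 2)."""
    return pattern[k - 3:k + 2]


def scan_bus(pattern):
    """List (wire, class, table) for every switching wire of the bus."""
    m = len(pattern)
    rows = []
    for k in range(1, m + 1):
        cls = wire_class(pattern, k)
        if cls is None:
            continue
        if k in (1, m):
            table = "Table IV"
        elif k in (2, m - 1):
            table = "Table V"
        else:
            table = "Table III"
        rows.append((k, cls, table))
    return rows


if __name__ == "__main__":
    rows = scan_bus("ududdd")
    worst = max(cls for _, cls, _ in rows)
    print([r for r in rows if r[1] == worst])
    print(five_wire_window("ududdd", 3))
```

For the six-wire example, the worst class is 4C on wires 2 and 3, so one candidate delay comes from Table V and the other from Table III, as stated in the text.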

The proposed analytical delay models target two important 
applications. One primary application of our model is the de- 
sign of crosstalk avoidance codes (CACs). Since our proposed 
models provide more accurate delays for different transition 
patterns than previous models, we can identify unwanted 
patterns more effectively. Second, our models can be applied to 
partial coding schemes, where buses are broken into sub-buses, 
since our models are more accurate for a bus of small size. 
To incorporate such analytical delay models into EDA software,
for example in a typical timing analysis flow, appropriate adjustments
are needed. We plan to investigate this important scenario in 
our future work. 

G. Discussion on synchronization problems 

In previous subsections, we assume simultaneous transitions
on all the wires. However, for global buses where buffer
insertion techniques are usually used to reduce their delay
[20], simultaneous signal transitions on the bus cannot be
guaranteed. Our derived models do not work for buses with
synchronization problems. In the following, we briefly discuss
the synchronization problems and conclude with insights on
the delay changes of interconnects with synchronization problems
and their impact on CAC designs.

Fig. 2. Interconnect structure.

Based on our three-wire and five-wire models, we observe 
two possible scenarios with regard to the impact of synchro- 
nization problems on the delay. When the time differences 
are relatively small, the delay is increased only by the time 
differences. When the time differences are sufficiently large, 
they can change the worst delay of a class to a different class. 
For instance, the delays of the transition patterns in 0C and
1C may be increased to those of 2C when the time differences
are large enough, and similarly 2C to 3C. This is consistent
with the observation in [17]. On the other hand, the delays
of the transition patterns in 3C may decrease to those of the 2C
class when the time differences are large enough. Intuitively,
this is because large time differences change the intended
transition patterns into different patterns. As observed above,
depending on the severity of the synchronization problems,
the effectiveness of CACs is affected to a varying extent.
Furthermore, the sensitivity to time differences varies with
CACs.

III. Performance evaluation 

We evaluate the performance of our delay models, and 
compare it with that of the model in [1] in three scenarios. 
First, since our delay models focus on three and five adjacent 
wires, we consider three- and five-wire buses. This scenario 
is also motivated by partial coding schemes (see, e.g., [12], 
[13], and [14]), which divide a wide bus into sub-buses with 
a few wires and separate them by shielding wires. The second 
scenario is buses with more than five wires. We have run 
extensive simulations on buses with an odd number of wires 
(up to 33 wires). Our conclusions are the same regardless of 
the number of wires. For brevity, we present our simulation 
results for 17- and 33-wire buses. In the first two scenarios, 
we focus on the worst-case delays of the middle wires. To 
characterize the whole bus transitions, our five-wire model 
can be applied to all wires to approximate their delays with 
higher accuracy. In the third scenario, we assume the transition 
patterns are limited to those of CACs and consider the worst- 
case delays for all wires of an 8-wire bus. 



TABLE IX
Comparison of simulated delays, delays of our three-wire
model, and the model in [1]. All the delays are in ps.

TABLE X
Comparison of simulated delays, delays of our five-wire model,
and the model in [1]. All the delays are in ps.

All the simulation results in this paper are obtained by the
following setup. The simulation is based on a 45nm technology 
with 10 metal layers [29]. The global buses are routed in the 
top two metal layers, 10 and 9, with a ground metal layer 8 
down below as shown in Fig. 2. We consider metal layer 10 for 
all buses, since the crosstalk is more serious than that of metal 
layer 9. The bus parameters are obtained by structure 1 in [30] 
and summarized in Table VIII, where Ku^u is the permittivity 
of the dielectric between metals. Since the model in [1] does
not account for the loading capacitance, we assume $C_L = 0$
fF for simulations in comparison with the model in [1]. We
also simulate 17- and 33-wire buses with $C_L = 100$ fF, which
represents the loading capacitance induced by a 400 times
inverter. The coupling factor is $\lambda = 12.2$. For
inputs with U = 10 ps, inductance effect is negligible when 
1.3 mm < L < 66.7 mm. All the buses for simulation have a 
length of 5 mm and the inductance effect is not considered in 
this work. The buses are divided into 100 sections as shown in 
Fig. 1 to characterize the distributed RC model. The simulation 
results are obtained from HSPICE. 
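
To illustrate what dividing a wire into 100 sections buys, the following is a minimal sketch, assuming a single isolated wire with total resistance and capacitance normalized to 1 and an ideal unit-step driver. It is not the coupled-bus HSPICE setup above and does not use the parameters of Table VIII; the function names and the backward-Euler integration are choices made for this illustration only. It computes the 50% delay at the far end of an n-section RC ladder, which for 100 sections should land near the textbook figure of about 0.38 RC for a distributed line, and the Elmore delay of the same ladder for comparison.

```python
def ladder_delay(n_sections=100, dt=1e-3, threshold=0.5, max_steps=10**6):
    """Return the threshold-crossing delay at the far end of an RC ladder.

    Time is in units of R*C, where R and C are the total wire resistance and
    capacitance; each section is a resistance R/n followed by a capacitance C/n.
    """
    n = n_sections
    a = dt * n * n                      # dt / ((R/n) * (C/n)) with R = C = 1
    diag = [1.0 + 2.0 * a] * n
    diag[-1] = 1.0 + a                  # far-end node has only one neighbour
    v = [0.0] * n                       # node voltages, initially discharged
    for step in range(max_steps):
        rhs = v[:]
        rhs[0] += a                     # unit-step source drives the first section
        # Thomas algorithm for the tridiagonal backward-Euler system.
        cp = [0.0] * n
        dp = [0.0] * n
        cp[0] = -a / diag[0]
        dp[0] = rhs[0] / diag[0]
        for k in range(1, n):
            denom = diag[k] + a * cp[k - 1]
            cp[k] = -a / denom
            dp[k] = (rhs[k] + a * dp[k - 1]) / denom
        new = [0.0] * n
        new[-1] = dp[-1]
        for k in range(n - 2, -1, -1):
            new[k] = dp[k] - cp[k] * new[k + 1]
        if new[-1] >= threshold:
            # Linear interpolation between the two bracketing time steps.
            frac = (threshold - v[-1]) / (new[-1] - v[-1])
            return (step + frac) * dt
        v = new
    raise RuntimeError("threshold not reached")


def ladder_elmore_delay(n_sections=100):
    """Elmore delay at the far end of the same ladder, in units of R*C."""
    n = n_sections
    # Node k sees upstream resistance k/n and carries capacitance 1/n.
    return sum((k / n) * (1.0 / n) for k in range(1, n + 1))


if __name__ == "__main__":
    print("50% delay / RC   :", ladder_delay())
    print("Elmore delay / RC:", ladder_elmore_delay())
```

The gap between the two printed numbers is one way to see why a model that avoids the Elmore delay can track simulated delays more closely.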

A. Three-wire and five-wire buses 

For a three-wire bus, the simulated delays are compared 
with the delays by our model and the model in [1] for all 
classes in Table IX, where Td denotes the simulated worst 
delay of wire 2, Tg"-^ the approximate delay for iC pattern 
by our three-wire model, and T2 by the model in [1]. The 
error percentages of our model and the model in [1] are also 
shown in Table IX. For all five classes of transition patterns,
the maximum and minimum errors by our model are only
3.14% and 0.47%, respectively, as opposed to 891.90% and
34.38% by the model in [1]. As Table IX shows,
our model is much more accurate than the model in [1] for
all patterns in a three-wire bus. We remark that the delay by
our model for the 1C pattern, (in ^) t, does not depend on
the coupling factor $\lambda$.
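
The five classes referred to above can be enumerated with a short sketch. It assumes the usual convention associated with the model in [1], in which a switching middle wire with transition d (+1 rising, -1 falling) and neighbour transitions l and r falls in class iC with i = 2 - d(l + r). This formula is not stated in this section, and the function names are hypothetical.

```python
from itertools import product

ARROW = {1: "↑", -1: "↓", 0: "-"}


def crosstalk_class(left, middle, right):
    """Return i for the iC class of the middle wire.

    Transitions are encoded as +1 (rising), -1 (falling), 0 (no transition).
    Assumed convention: i = 2 - middle * (left + right).
    """
    if middle == 0:
        raise ValueError("the middle wire must switch")
    return 2 - middle * (left + right)


def classes_for_rising_middle():
    """Group all nine neighbour combinations around a rising middle wire."""
    groups = {i: [] for i in range(5)}
    for left, right in product((1, 0, -1), repeat=2):
        i = crosstalk_class(left, 1, right)
        groups[i].append(ARROW[left] + ARROW[1] + ARROW[right])
    return groups


if __name__ == "__main__":
    for i, patterns in classes_for_rising_middle().items():
        print(f"{i}C:", " ".join(patterns))
```

Under this convention the all-rising pattern is the only 0C pattern for a rising middle wire, and the pattern with both neighbours falling is the only 4C pattern.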

For a five-wire bus, the worst delays of all classes of 
transition patterns based on our five-wire model are compared 
with those of the model in [1] as well as the simulated delays 
by HSPICE in Table X, where Td denotes the simulated worst- 
case delay of wire 3 for all iC patterns, T^*^ the approximate 
delay for iC pattern by our five-wire model, and Tg by the 
model in [1]. The error percentages of our model and the 
model in [1] are shown in Table X. For a five-wire bus, the
maximum and minimum errors by our model are 34.41% and
1.59%, respectively, in comparison to 84.28% and 16.50% by
the model in [1]. As Table X shows, our five-wire
model is more accurate than the model in [1] for all patterns
in a five-wire bus. In particular, although the delays in the
model in [1] were claimed to be upper bounds on the actual
delays, our simulation results in Table X show that this claim
is invalid for the 0C patterns. In [17], the author proposed a
method which achieves a delay of $\tau_0$ by surrounding each data
wire with two shield wires with the same transition. Since the
transition patterns for each data wire are always in the 0C class,
the delays of the data wires are $\tau_0$ according to the model in
[1]. In contrast, the delay for the data wires can be as large
as $0.165(1 + 3\lambda)\tau_0$ by our model. When the coupling
factor $\lambda$ is large, the model
in [1] severely underestimates the delay, while our model is
more accurate.
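
As a rough worked example, writing the coupling factor as $\lambda$ and the delay predicted by the model in [1] for the shielded wire as $\tau_0$, and taking the value of 12.2 from the simulation setup, the bound quoted above evaluates to

\[
0.165\,(1 + 3\lambda)\,\tau_0 = 0.165 \times 37.6\,\tau_0 \approx 6.2\,\tau_0 \qquad (\lambda = 12.2),
\]

that is, roughly six times the delay that the model in [1] assigns to the same shielded wire.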

B. 17-wire and 33-wire buses

We next compare our five-wire model with the model in 
[1] for 17- and 33-wire buses. With a 17-wire bus, we focus 
on the middle wire (wire 9). We still classify the transition 
patterns according to the transitions of the middle three wires 
(wires 8, 9, and 10). Since it is time consuming to identify 
the transition patterns with the longest delay in each class, we 
make one assumption about the patterns with the longest delay 
in each class. For any two wires symmetric to wire 9 (wire
$i$ and wire $18-i$, $i \in \{1, 2, \ldots, 8\}$), there are nine possible
patterns: ↑↑, ↓↓, --, ↑-, -↑, ↓-, -↓, ↑↓, and ↓↑. For patterns in
opposite directions, we assume the influences of the two wires
cancel out because of symmetry. For the other patterns, if the
upward transition of one wire increases the delay, then
↑↑ has a greater delay than ↑- or -↑. Similarly, if the downward
transition increases the delay, the pattern ↓↓ has a greater delay
than ↓- or -↓. So we assume that the longest delay happens
when two symmetric wires have either ↑↑ or ↓↓ transitions.
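
The reduction from nine pair patterns to two candidates can be sketched as follows. This is only an illustration of the counting argument above; the function name and the string encoding of transitions are hypothetical.

```python
from itertools import product

UP, DOWN, NONE = "↑", "↓", "-"


def symmetric_pair_candidates():
    """Enumerate the transitions of two wires placed symmetrically about the middle wire.

    Returns (all_pairs, opposite, same_direction):
      all_pairs      - the nine possible pair patterns,
      opposite       - pairs assumed to cancel out by symmetry,
      same_direction - the two pairs kept as worst-case candidates.
    """
    all_pairs = [a + b for a, b in product((UP, DOWN, NONE), repeat=2)]
    opposite = [p for p in all_pairs if set(p) == {UP, DOWN}]
    same_direction = [p for p in all_pairs if p in (UP + UP, DOWN + DOWN)]
    return all_pairs, opposite, same_direction


if __name__ == "__main__":
    all_pairs, opposite, same_direction = symmetric_pair_candidates()
    print(len(all_pairs), "pair patterns:", " ".join(all_pairs))
    print("assumed to cancel:", " ".join(opposite))
    print("worst-case candidates:", " ".join(same_direction))
```

The remaining five patterns, in which at least one wire of the pair does not switch, are the ones the argument above discards as dominated by one of the two same-direction pairs.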

Based on this assumption, we search all possible symmetric
transition patterns to find the worst-case patterns in each class,
which are listed in the second column of Table XI, where the
patterns on wires 8, 9, and 10 are shown in parentheses.
The simulated worst-case delays for all iC, denoted by Td,
are compared with the delays by our five-wire model and the
model in [1] in Table XI. The error percentages of our model
and the model in [1] are also shown in Table XI. For all five
classes, the maximum and minimum errors by our model are
only 45.10% and 5.66%, respectively, as opposed to 86.84%
and 8.89% by the model in [1]. For all classes
except 1C, our five-wire model outperforms the model in [1].
The model in [1] also has a large error percentage for 0C.



TABLE XI
Comparison of simulated delays and delays given by our five-wire model and the model in [1] for wire 9 in a 17-wire bus with
$C_L = 0$ fF. All the delays are in ps.

TABLE XII

Comparison of simulated delays and delays given by our five-wire model and [1] for wire 17 in a 33-wire bus with $C_L = 0$ fF. All
the delays are in ps.

TABLE XIII
Comparison of simulated delays and delays given by our five-wire model focusing on the middle wire in 17-wire and 33-wire
buses with $C_L = 100$ fF. All the delays are in ps.

With a 33-wire bus, we focus on the delay of the middle
wire (wire 17). Since there are 3^"^ transition patterns, it 
is infeasible to search all possible symmetric transitions as 
before to find the worst-case patterns. We make the following
three assumptions: (1) The worst patterns in each class are
symmetric; (2) The closer a wire is to the middle wire,
the greater its coupling effect on the settling of the middle wire; (3)
We initialize the middle three wires to a pattern in iC, and
initialize all other wires with transitions opposite to that of the middle
wire. Based on these three assumptions, we use Alg. 1 to find
the patterns with the largest delays. We denote by $P_i$ the updated
transition pattern of an $m$-wire bus after the $i$-th iteration
of Alg. 1, where $m$ is odd. Alg. 1 can greatly reduce the
simulation time for identifying the worst-case patterns. For
instance, the worst-case patterns for a 33-wire bus can be
identified by simulating only 5 × 15 = 75 transition patterns.

Algorithm 1
Require: $m$-wire bus;
Initialize: $P_0$ is initialized with transitions opposite to wire
    $\frac{m+1}{2}$, except for wires $\frac{m-1}{2}$, $\frac{m+1}{2}$, and $\frac{m+3}{2}$;
$i := 0$;
repeat
    for $j = \frac{m-3}{2}$ to 1 do
        Flip the transitions of wires $j$ and $(m + 1 - j)$ in $P_i$;
        if the delay of wire $\frac{m+1}{2}$ increases then
            Keep the changes;
        else
            Reverse the changes;
        end if
    end for
    $i := i + 1$;
    Update $P_i$ with the current pattern;
until $P_{i-1} = P_i$
return Worst-case transition pattern for wire $\frac{m+1}{2}$;

We note that the one assumption about the worst-case
patterns for 17-wire buses and the three assumptions about
33-wire buses are made in order to reduce the complexity
of finding the worst-case patterns. We did verify our three
assumptions about 33-wire buses over 9- and 11-wire buses:
the worst cases for all the classes based on Alg. 1 are indeed
the worst cases by exhaustive search. This also verifies the
assumption for 17-wire buses, since it is one of the three
assumptions for 33-wire buses. For instance, the worst-case 2C
pattern of an 11-wire bus is given by ↑↑↑↓-↑-↓↑↑↑ with exhaustive
search. In Alg. 1, starting from ↓↓↓↓-↑-↓↓↓↓, the worst-case
pattern is found via the order: ↓↓↓↓-↑-↓↓↓↓ ⇒ ↓↓↑↓-↑-↓↑↓↓
⇒ ↓↑↑↓-↑-↓↑↑↓ ⇒ ↑↑↑↓-↑-↓↑↑↑. Unfortunately, it
is difficult to verify Alg. 1, even for one case, for 17- or 33-
wire buses, because the complexity would be prohibitive. For
instance, for each class focusing on the middle wire, there
are $3^{14} = 4782969$ possible patterns for a 17-wire bus (and
$3^{30} \approx 2.06 \times 10^{14}$ for a 33-wire bus), and it takes about 166
days to simulate these cases.
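
The greedy search of Alg. 1 can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the `delay` callback stands in for an HSPICE run, and `toy_delay` is a made-up linear function chosen only so that the 11-wire 2C example above can be reproduced. Transitions are encoded as +1 (rising), -1 (falling), and 0 (no transition).

```python
from typing import Callable, List, Sequence


def worst_case_pattern(m: int,
                       middle_three: Sequence[int],
                       delay: Callable[[List[int]], float]) -> List[int]:
    """Greedy search in the spirit of Alg. 1 for an m-wire bus (m odd).

    middle_three fixes wires (m-1)/2, (m+1)/2, (m+3)/2; delay(pattern) returns
    the delay of the middle wire (a hypothetical stand-in for a simulator).
    """
    if m % 2 == 0 or m < 5:
        raise ValueError("m must be odd and at least 5")
    centre = middle_three[1]
    if centre == 0:
        raise ValueError("the middle wire must switch")
    mid = (m - 1) // 2                       # 0-based index of wire (m+1)/2
    pattern = [-centre] * m                  # all wires opposite to the middle wire
    pattern[mid - 1:mid + 2] = list(middle_three)
    best = delay(pattern)
    while True:
        previous = pattern[:]
        for j in range((m - 3) // 2, 0, -1):     # j = (m-3)/2 down to 1 (1-based)
            left, right = j - 1, m - j           # 0-based wires j and m+1-j
            pattern[left] = -pattern[left]
            pattern[right] = -pattern[right]
            trial = delay(pattern)
            if trial > best:
                best = trial                     # keep the changes
            else:
                pattern[left] = -pattern[left]   # reverse the changes
                pattern[right] = -pattern[right]
        if pattern == previous:                  # no change in a full pass
            return pattern


def toy_delay(pattern: List[int]) -> float:
    """Made-up delay for an 11-wire bus: wires 4 and 8 hurt most when falling."""
    coef = [0.1] * 11
    coef[3] = coef[7] = -1.0
    return sum(c * x for c, x in zip(coef, pattern))


if __name__ == "__main__":
    arrows = {1: "↑", -1: "↓", 0: "-"}
    result = worst_case_pattern(11, (0, 1, 0), toy_delay)
    print("".join(arrows[x] for x in result))
```

Each pass costs (m-3)/2 delay evaluations, which is where the saving over exhaustive search comes from. With the toy delay, the search visits the same sequence of patterns as the 11-wire example in the text, but this says nothing about real simulated delays.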

The worst transition patterns for each class in a 33-wire
bus, with respect to the three assumptions above, are listed in
the second column of Table XII, where the patterns on wires
16, 17, and 18 are shown in parentheses. The simulated
worst-case delays of wire 17 for all iC patterns, denoted by
Td, are compared with the delays of our five-wire model and
the model in [1]. The error percentages of our model and the
model in [1] are also shown in Table XII. The maximum and
minimum errors by our model are only 45.23% and 5.95%,
respectively, in comparison to 86.87% and 7.61% by the model
in [1]. Again, for all classes except 1C, our five-
wire model outperforms the model in [1]. The model in [1]
also has a large error percentage for 0C.

Since our revised models also account for the loading capac-
itance, we also simulate 17- and 33-wire buses with $C_L = 100$
fF, which represents the loading capacitance induced by a 400
times inverter. The simulated worst-case delays of the middle
wire for all iC patterns, denoted by Td, are compared with the 
delays T^'-^ by our five-wire model as shown in Table XIII. The 
error percentages of our model are also shown in Table XIII. 
The worst-case patterns are obtained via Alg. 1. The worst-
case patterns differ from those in Tables XI and XII
because the loading capacitance is different. However, our
five-wire model can still approximate the delays with error
percentages similar to those in Tables XI and XII.

Finally, we remark that the longest delays for each class in 
Tables XI and XII are approximately the same for both 17- 
and 33-wire buses. Based on the simulation results of 17- and 
33-wire buses, we conjecture that our five-wire model would 
be more accurate than the model in [1] for buses with any 
number of wires. 



C. Performance of CACs 

TABLE XIV

Comparison of simulated delays and delays given by our
five-wire model and [1] for all wires in an 8-wire bus, where
TJ*^, T^i, and T^^ denote the delays of wires 3-8, wire 1 (m),
and wire 2 (m - 1), respectively. All the delays are in ps.

In the simulation results above, we assume the transition
patterns are arbitrary. Herein, we assume the transition patterns
are limited to those of CACs. We evaluate the performance
of our delay model for three families of CACs [12]-[14]:
one Lambda codes (OLCs), forbidden pattern codes (FPCs),
and forbidden overlap codes (FOCs). Based on our five-wire
model, the worst delays of the aforementioned CACs are shown
in Tables III, IV, and V. Based on the model in [1], the worst
delays of the aforementioned CACs are approximated by $(1+\lambda)\tau_0$,
$(1 + 2\lambda)\tau_0$, and $(1 + 3\lambda)\tau_0$, respectively. Since the number of
transition patterns is a quadratic function of the number of
codewords, it is time-consuming to simulate a large bus to get
the worst-case delays on all wires. Hence, for each CAC, we
simulate an 8-wire bus. The numbers of codewords of OLC,
FPC, and FOC are given by 16, 68, and 149, respectively.
The total numbers of transition patterns for OLC, FPC, and
FOC are given by 240, 4556, and 22052, respectively. We
obtain by simulation the maximum delays of each wire for
all transition patterns. The simulation results are shown in
Table XIV, where the delays given by our five-wire model
and the model in [1] are also included. Intuitively, the worst-
case delays of any two symmetric wires are the same, since the
mirror image of a valid transition pattern is also valid.
As shown in Table XIV, the simulated delays of symmetric
wires are very close. For OLCs, FPCs, and FOCs, the largest
delays are emphasized in boldface. As Table XIV shows, our
delay models are more accurate than the model in [1] for all
three families of CACs.
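
The quoted totals of transition patterns are consistent with counting ordered pairs of distinct codewords, n(n - 1), which is also why the count grows quadratically with the number of codewords. The short check below rests on that reading, which is an inference from the numbers rather than a definition given in the text; the names are hypothetical.

```python
def transition_pattern_count(num_codewords: int) -> int:
    """Ordered pairs of distinct codewords (previous word -> next word)."""
    return num_codewords * (num_codewords - 1)


# Codeword counts for the 8-wire bus, as quoted in the text.
CODEWORDS_8_WIRE = {"OLC": 16, "FPC": 68, "FOC": 149}

counts = {name: transition_pattern_count(n)
          for name, n in CODEWORDS_8_WIRE.items()}

if __name__ == "__main__":
    for name, total in counts.items():
        print(name, total)
```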

IV. Conclusions and future work 

In this paper, we propose improved analytical delay models 
for coupled interconnects. We first derive closed-form expres- 
sions of the signals on the bus, based on the distributed RC 
model, and then approximate the delays of different patterns by 
evaluating these closed-form expressions. We focus on three- 
wire and five-wire models, and simulation results show that 
our model has better accuracy than the model in [1]. Although
our models are based on three-wire and five-wire buses, they 
are not limited to these two cases. For a bus with more than 
five wires, our five-wire model can still approximate delays 
better than the model in [1]. 